49 research outputs found

    Broad phonetic class definition driven by phone confusions

    Get PDF
    Intermediate representations between the speech signal and phones may be used to improve discrimination among phones that are often confused. These representations are usually found according to broad phonetic classes, which are defined by a phonetician. This article proposes an alternative data-driven method to generate these classes. Phone confusion information from the analysis of the output of a phone recognition system is used to find clusters at high risk of mutual confusion. A metric is defined to compute the distance between phones. The results, using TIMIT data, show that the proposed confusion-driven phone clustering method is an attractive alternative to the approaches based on human knowledge. A hierarchical classification structure to improve phone recognition is also proposed using a discriminative weight training method. Experiments show improvements in phone recognition on the TIMIT database compared to a baseline system

    Phoneme Recognition on the TIMIT Database

    Get PDF

    Anålise eståtica de vigas em materiais compósitos usando computação simbólica

    Get PDF
    Na presente dissertação estuda-se o comportamento mecĂąnico em vigas construĂ­das em materiais compĂłsitos, abordando diferentes teorias de base. Foram desenvolvidos dois elementos Lagrangeanos de viga-barra, baseados na teoria ao corte de primeira ordem e na teoria ao corte de ordem superior, quadrĂĄticos e cĂșbicos, respectivamente. Os modelos foram implementados na aplicação de computação simbĂłlica MAPLE e comparados com soluçÔes alternativas. Foram realizados estudos em anĂĄlise estĂĄtica linear e nesse contexto foi estudada a influĂȘncia da variação dos Ăąngulos de orientação das fibras e do nĂșmero de camadas de empilhamento do laminado. Foi iniciado um estudo sobre o comportamento de materiais FGM

    Mispronunciation Detection in Children's Reading of Sentences

    Get PDF
    This work proposes an approach to automatically parse children’s reading of sentences by detecting word pronunciations and extra content, and to classify words as correctly or incorrectly pronounced. This approach can be directly helpful for automatic assessment of reading level or for automatic reading tutors, where a correct reading must be identified. We propose a first segmentation stage to locate candidate word pronunciations based on allowing repetitions and false starts of a word’s syllables. A decoding grammar based solely on syllables allows silence to appear during a word pronunciation. At a second stage, word candidates are classified as mispronounced or not. The feature that best classifies mispronunciations is found to be the log-likelihood ratio between a free phone loop and a word spotting model in the very close vicinity of the candidate segmentation. Additional features are combined in multi-feature models to further improve classification, including: normalizations of the log-likelihood ratio, derivations from phone likelihoods, and Levenshtein distances between the correct pronunciation and recognized phonemes through two phoneme recognition approaches. Results show that most extra events were detected (close to 2% word error rate achieved) and that using automatic segmentation for mispronunciation classification approaches the performance of manual segmentation. Although the log-likelihood ratio from a spotting approach is already a good metric to classify word pronunciations, the combination of additional features provides a relative reduction of the miss rate of 18% (from 34.03% to 27.79% using manual segmentation and from 35.58% to 29.35% using automatic segmentation, at constant 5% false alarm rate).info:eu-repo/semantics/publishedVersio

    WORKSHOP NACIONAL “INVESTIGAÇÃO EM OLIVICULTURA E AZEITE - RESULTADOS E APLICAÇÕES

    Get PDF
    Workshop Nacional “Investigação em Olivicultura e Azeite – Resultados e AplicaçÔes” De 27-06-2013, 09:00 a 28-06-2013 Sala de ConferĂȘncias do PĂłlo da Mitra da Universidade de Évora O Workshop Nacional dedicado ao tema “Investigação em Olivicultura e Azeite – Resultados e AplicaçÔes” realizou-se nos dias 27 e 28 de junho de 2013. Pretendeu-se com este evento, dirigido a produtores, tĂ©cnicos, e todos os interessados, divulgar os resultados obtidos no Ăąmbito de projetos de investigação em curso em Portugal na ĂĄrea da Olivicultura e Azeite, com especial ĂȘnfase nas potenciais aplicaçÔes e contribuiçÔes para o desenvolvimento do setor. As sessĂ”es incidiram nos seguintes temas Sistemas e TĂ©cnicas Culturais Recursos GenĂ©ticos e Melhoramento Proteção FitossanitĂĄria Tecnologia e Qualidade do Azeite ApresentaçÔes Tema 1 - Sistemas e TĂ©cnicas Culturais MĂșltiplos olhares sobre a transpiração do olival - Francisco LĂșcio dos Santos GestĂŁo de cobertos vegetais de leguminosas anuais de ressementeira natural em olival - Manuel Ângelo Rodrigues Mecanização do Olival - soluçÔes e novos desafios - Arlindo de Almeida MĂĄquina para a Colheita em ContĂ­nuo de Azeitona - AntĂłnio Bento Dias InfluĂȘncia de diferentes regimes hĂ­dricos no uso e na eficiĂȘncia do uso da ĂĄgua, produção e qualidade do azeite - Anabela Silva Medidas de prevenção e mitigação dos impactos da seca no olival de sequeiro: efeitos da rega sustentĂĄvel e de coberturas vegetais - Eunice Bacelar Contribuição do COTR para o uso eficiente da ĂĄgua na rega do olival - LuĂ­s Boteta Tema 2 - Recursos GenĂ©ticos e Melhoramentos A diversidade genĂ©tica da mosca-da-azeitona na bacia mediterrĂąnica - LuĂ­s Teixeira Evolução dos primeiros estados fenolĂłgicos em oliveira - diversidade intervarietal e alteraçÔes climĂĄticas - AntĂłnio Cordeiro Variabilidade genĂ©tica e quĂ­mica - implicaçÔes na rastreabilidade de azeites - Paula Lopes Tema 3 - Protecção FitossanitĂĄria A elevada incidĂȘncia de vĂ­rus em olivais nacionais: causas e consequĂȘncias - RosĂĄrio FĂ©lix Como proceder para fertilizar racionalmente o olival - Pedro JordĂŁo Avaliação de GenĂłtipos de Olea europaea vs Infecção de Colletotrichum acutatum - Teresa Carvalho Isolamento e seleção de fungos endofĂ­ticos da oliveira para luta biolĂłgica contra Colletotrichum acutatum e Verticillium dahliae - Paula Batista A utilização de indicadores biolĂłgicos como ferramentas para avaliar o impacte de prĂĄticas agrĂ­colas na sustentabilidade do olival - SĂłnia Santos Proteção contra pragas da oliveira: fomento da ação dos inimigos naturais pelo estabelecimento da flora autĂłctone - Albino Bento Fungos entomopatogĂ©nicos em pragas da oliveira: isolamento, caracterização e selecção para controlo biolĂłgico - Paula Batista Tema 4 - Tecnologia e Qualidade Redefinição da Denominação de Origem Protegida “Azeite de TrĂĄs-os-Montes” e criação da Denominação de Origem Protegida “Azeite do Douro” - Ricardo Malheiro Azeitonas de mesa do nordeste de Portugal - contributo para a sua caraterização e promoção - Nuno Rodrigues ÁCIDOS GORDOS E POLIFENÓIS EM AZEITE VIRGEM - Isabel Baer InfluĂȘncia da rega na produção e qualidade de azeites produzidos em olivais intensivos (cv. Cobrançosa) e em olivais em sebe (cv. Arbequina) - Mariana Mota Efeito do processamento culinĂĄrio na composição nutricional e quĂ­mica de azeites portugueses - Susana Casal A Arte do Azeite - Ana Carrilho O Azeite PortuguĂȘs na Economia Global: Oportunidades e desafios - Teresa Zacarias ComissĂŁo Organizadora: AdĂ©lia Sousa, Grupo ASC do ICAAM AntĂłnio Bento Dias, Grupo CTV do ICAAM Fernando Rei, Grupo CTV do ICAAM Francisco LĂșcio dos Santos, Grupo ASC do ICAAM-coordenador Raquel Garcia, Grupo CTV do ICAAM Renato Coelho, Grupo CTV do ICAAM Joana PerdigĂŁo, UDIT-ICAA

    ALBAYZIN Query-by-example Spoken Term Detection 2016 evaluation

    Full text link
    [EN] Query-by-example Spoken Term Detection (QbE STD) aims to retrieve data from a speech repository given an acoustic (spoken) query containing the term of interest as the input. This paper presents the systems submitted to the ALBAYZIN QbE STD 2016 Evaluation held as a part of the ALBAYZIN 2016 Evaluation Campaign at the IberSPEECH 2016 conference. Special attention was given to the evaluation design so that a thorough post-analysis of the main results could be carried out. Two different Spanish speech databases, which cover different acoustic and language domains, were used in the evaluation: the MAVIR database, which consists of a set of talks from workshops, and the EPIC database, which consists of a set of European Parliament sessions in Spanish. We present the evaluation design, both databases, the evaluation metric, the systems submitted to the evaluation, the results, and a thorough analysis and discussion. Four different research groups participated in the evaluation, and a total of eight template matching-based systems were submitted. We compare the systems submitted to the evaluation and make an in-depth analysis based on some properties of the spoken queries, such as query length, single-word/multi-word queries, and in-language/out-of-language queries.This work was partially supported by Fundacao para a Ciencia e Tecnologia (FCT) under the projects UID/EEA/50008/2013 (pluriannual funding in the scope of the LETSREAD project) and UID/CEC/50021/2013, and Grant SFRH/BD/97187/2013. Jorge Proenca is supported by the SFRH/BD/97204/2013 FCT Grant. This work was also supported by the Galician Government ('Centro singular de investigacion de Galicia' accreditation 2016-2019 ED431G/01 and the research contract GRC2014/024 (Modalidade: Grupos de Referencia Competitiva 2014)), the European Regional Development Fund (ERDF), the projects "DSSL: Redes Profundas y Modelos de Subespacios para Deteccion y Seguimiento de Locutor, Idioma y Enfermedades Degenerativas a partir de la Voz" (TEC2015-68172-C2-1-P) and the TIN2015-64282-R funded by Ministerio de Economia y Competitividad in Spain, the Spanish Government through the project "TraceThem" (TEC2015-65345-P), and AtlantTIC ED431G/04.Tejedor, J.; Toledano, DT.; Lopez-Otero, P.; Docio-Fernandez, L.; Proença, J.; PerdigĂŁo, F.; GarcĂ­a-Granada, F.... (2018). ALBAYZIN Query-by-example Spoken Term Detection 2016 evaluation. EURASIP Journal on Audio, Speech and Music Processing. 1-25. https://doi.org/10.1186/s13636-018-0125-9S125Jarina, R, Kuba, M, Gubka, R, Chmulik, M, Paralic, M (2013). UNIZA system for the spoken web search task at MediaEval 2013. In Proc. of MediaEval. Ruzica Piskac, New Haven, (pp. 791–792).Ali, A, & Clements, MA (2013). Spoken web search using and ergodic hidden Markov model of speech. In Proc. of MediaEval. Ruzica Piskac, New Haven, (pp. 861–862).Buzo, A, Cucu, H, Burileanu, C (2014). SpeeD@MediaEval 2014: Spoken term detection with robust multilingual phone recognition. In Proc. of MediaEval. Ruzica Piskac, New Haven, (pp. 721–722).Caranica, A, Buzo, A, Cucu, H, Burileanu, C (2015). SpeeD@MediaEval 2015: Multilingual phone recognition approach to Query By Example STD. In Proc. of MediaEval. Ruzica Piskac, New Haven, (pp. 781–783).Kesiraju, S, Mantena, G, Prahallad, K (2014). IIIT-H system for MediaEval 2014 QUESST. In Proc. of MediaEval. Ruzica Piskac, New Haven, (pp. 761–762).Ma, M, & Rosenberg, A (2015). CUNY systems for the Query-by-Example search on speech task at MediaEval 2015. In Proc. of MediaEval. Ruzica Piskac, New Haven, (pp. 831–833).Takahashi, J, Hashimoto, T, Konno, R, Sugawara, S, Ouchi, K, Oshima, S, Akyu, T, Itoh, Y (2014). An IWAPU STD system for OOV query terms and spoken queries. In Proc. of NTCIR-11. National Institute of Informatics, Tokyo, (pp. 384–389).Makino, M, & Kai, A (2014). Combining subword and state-level dissimilarity measures for improved spoken term detection in NTCIR-11 SpokenQuery & Doc task. In Proc. of NTCIR-11. National Institute of Informatics, Tokyo, (pp. 413–418).Konno, R, Ouchi, K, Obara, M, Shimizu, Y, Chiba, T, Hirota, T, Itoh, Y (2016). An STD system using multiple STD results and multiple rescoring method for NTCIR-12 SpokenQuery & Doc task. In Proc. of NTCIR-12. National Institute of Informatics, Tokyo, (pp. 200–204).Sakamoto, N, Yamamoto, K, Nakagawa, S (2015). Combination of syllable based N-gram search and word search for spoken term detection through spoken queries and IV/OOV classification. In Proc. of ASRU. IEEE, New York, (pp. 200–206).Hou, J, Pham, VT, Leung, C-C, Wang, L, 2, HX, Lv, H, Xie, L, Fu, Z, Ni, C, Xiao, X, Chen, H, Zhang, S, Sun, S, Yuan, Y, Li, P, Nwe, TL, Sivadas, S, Ma, B, Chng, ES, Li, H (2015). The NNI Query-by-Example system for MediaEval 2015. In Proc. of MediaEval. Ruzica Piskac, New Haven, (pp. 141–143).Vavrek, J, Viszlay, P, Lojka, M, Pleva, M, Juhar, J, Rusko, M (2015). TUKE at MediaEval 2015 QUESST. In Proc. of MediaEval. Ruzica Piskac, New Haven, (pp. 451–453).Mantena, G, Achanta, S, Prahallad, K (2014). Query-by-example spoken term detection using frequency domain linear prediction and non-segmental dynamic time warping. IEEE/ACM Transactions on Audio, Speech and Language Processing, 22(5), 946–955.Anguera, X, & Ferrarons, M (2013). Memory efficient subsequence DTW for query-by-example spoken term detection. In Proc. of ICME. IEEE, New York, (pp. 1–6).Tulsiani, H, & Rao, P (2015). The IIT-B Query-by-Example system for MediaEval 2015. In Proc. of MediaEval. Ruzica Piskac, New Haven, (pp. 341–343).Bouallegue, M, Senay, G, Morchid, M, Matrouf, D, Linares, G, Dufour, R (2013). LIA@MediaEval 2013 spoken web search task: An I-Vector based approach. In Proc. of MediaEval. Ruzica Piskac, New Haven, (pp. 771–772).Rodriguez-Fuentes, LJ, Varona, A, Penagarikano, M, Bordel, G, Diez, M (2013). GTTS systems for the SWS task at MediaEval 2013. In Proc. of MediaEval. Ruzica Piskac, New Haven, (pp. 831–832).Wang, H, Lee, T, Leung, C-C, Ma, B, Li, H (2013). Using parallel tokenizers with DTW matrix combination for low-resource spoken term detection. In Proc. of ICASSP. IEEE, New York, (pp. 8545–8549).Wang, H, & Lee, T (2013). The CUHK spoken web search system for MediaEval 2013. In Proc. of MediaEval. Ruzica Piskac, New Haven, (pp. 681–682).Proenca, J, Veiga, A, PerdigĂŁo, F (2014). The SPL-IT query by example search on speech system for MediaEval 2014. In Proc. of MediaEval. Ruzica Piskac, New Haven, (pp. 741–742).Proenca, J, Veiga, A, Perdigao, F (2015). Query by example search with segmented dynamic time warping for non-exact spoken queries. In Proc. of EUSIPCO. Springer, Berlin, (pp. 1691–1695).Proenca, J, Castela, L, Perdigao, F (2015). The SPL-IT-UC Query by Example search on speech system for MediaEval 2015. In Proc. of MediaEval. Ruzica Piskac, New Haven, (pp. 471–473).Proenca, J, & Perdigao, F (2016). Segmented dynamic time warping for spoken Query-by-Example search. In Proc. of Interspeech. ISCA, Baixas, (pp. 750–754).Lopez-Otero, P, Docio-Fernandez, L, Garcia-Mateo, C (2015). GTM-UVigo systems for the Query-by-Example search on speech task at MediaEval 2015. In Proc. of MediaEval. Ruzica Piskac, New Haven, (pp. 521–523).Lopez-Otero, P, Docio-Fernandez, L, Garcia-Mateo, C (2015). Phonetic unit selection for cross-lingual Query-by-Example spoken term detection. In Proc. of ASRU. IEEE, New York, (pp. 223–229).Saxena, A, & Yegnanarayana, B (2015). Distinctive feature based representation of speech for Query-by-Example spoken term detection. In Proc. of Interspeech. ISCA, Baixas, (pp. 3680–3684).Skacel, M, & Szöke, I (2015). BUT QUESST 2015 system description. In Proc. of MediaEval. Ruzica Piskac, New Haven, (pp. 721–723).Chen, H, Leung, C-C, Xie, L, Ma, B, Li, H (2016). Unsupervised bottleneck features for low-resource Query-by-Example spoken term detection. In Proc. of Interspeech. ISCA, Baixas, (pp. 923–927).Yuan, Y, Leung, C-C, Xie, L, Chen, H, Ma, B, Li, H (2017). Pairwise learning using multi-lingual bottleneck features for low-resource Query-by-Example spoken term detection. In Proc. of ICASSP. IEEE, New York, (pp. 5645–5649).Torbati, AHHN, & Picone, J (2016). A nonparametric Bayesian approach for spoken term detection by example query. In Proc. of Interspeech. ISCA, Baixas, (pp. 928–932).Popli, A, & Kumar, A (2015). Query-by-example spoken term detection using low dimensional posteriorgrams motivated by articulatory classes. In Proc. of MMSP. IEEE, New York, (pp. 1–6).Yang, P, Leung, C-C, Xie, L, Ma, B, Li, H (2014). Intrinsic spectral analysis based on temporal context features for query-by-example spoken term detection. In Proc. of Interspeech. ISCA, Baixas, (pp. 1722–1726).George, B, Saxena, A, Mantena, G, Prahallad, K, Yegnanarayana, B (2014). Unsupervised query-by-example spoken term detection using bag of acoustic words and non-segmental dynamic time warping. In Proc. of Interspeech. ISCA, Baixas, (pp. 1742–1746).Hazen, TJ, Shen, W, White, CM (2009). Query-by-example spoken term detection using phonetic posteriorgram templates. In Proc. of ASRU. IEEE, New York, (pp. 421–426).Abad, A, Astudillo, RF, Trancoso, I (2013). The L2F spoken web search system for mediaeval 2013. In Proc. of MediaEval. Ruzica Piskac, New Haven, (pp. 851–852).Szöke, I, SkĂĄcel, M, Burget, L (2014). BUT QUESST 2014 system description. In Proc. of MediaEval. Ruzica Piskac, New Haven, (pp. 621–622).Szöke, I, Burget, L, GrĂ©zl, F, ČernockĂœ, JH, Ondel, L (2014). Calibration and fusion of query-by-example systems - BUT SWS 2013. In Proc. of ICASSP. IEEE, New York, (pp. 621–622).Abad, A, RodrĂ­guez-Fuentes, LJ, Penagarikano, M, Varona, A, Bordel, G (2013). On the calibration and fusion of heterogeneous spoken term detection systems. In Proc. of Interspeech. ISCA, Baixas, (pp. 20–24).Yang, P, Xu, H, Xiao, X, Xie, L, Leung, C-C, Chen, H, Yu, J, Lv, H, Wang, L, Leow, SJ, Ma, B, Chng, ES, Li, H (2014). The NNI query-by-example system for MediaEval 2014. In Proc. of MediaEval. Ruzica Piskac, New Haven, (pp. 691–692).Leung, C-C, Wang, L, Xu, H, Hou, J, Pham, VT, Lv, H, Xie, L, Xiao, X, Ni, C, Ma, B, Chng, ES, Li, H (2016). Toward high-performance language-independent Query-by-Example spoken term detection for MediaEval 2015: Post-evaluation analysis. In Proc. of Interspeech. ISCA, Baixas, (pp. 3703–3707).Xu, H, Hou, J, Xiao, X, Pham, VT, Leung, C-C, Wang, L, Do, VH, Lv, H, Xie, L, Ma, B, Chng, ES, Li, H (2016). Approximate search of audio queries by using DTW with phone time boundary and data augmentation. In Proc. of ICASSP. IEEE, New York, (pp. 6030–6034).Oishi, S, Matsuba, T, Makino, M, Kai, A (2016). Combining state-level and DNN-based acoustic matches for efficient spoken term detection in NTCIR-12 SpokenQuery &Doc-2 task. In Proc. of NTCIR-12. National Institute of Informatics, Tokyo, (pp. 205–210).Oishi, S, Matsuba, T, Makino, M, Kai, A (2016). Combining state-level spotting and posterior-based acoustic match for improved query-by-example spoken term detection. In Proc. of Interspeech. ISCA, Baixas, (pp. 740–744).Obara, M, Kojima, K, Tanaka, K, Lee, S-w, Itoh, Y (2016). Rescoring by combination of posteriorgram score and subword-matching score for use in Query-by-Example. In Proc. of Interspeech. ISCA, Baixas, (pp. 1918–1922).NIST. The Ninth Text REtrieval Conference (TREC 9). http://trec.nist.gov . Accessed Feb 2018.Anguera, X, Rodriguez-Fuentes, LJ, Szöke, I, Buzo, A, Metze, F (2014). Query by Example Search on Speech at Mediaeval 2014. In Proc. of MediaEval. Ruzica Piskac, New Haven, (pp. 351–352).Joho, H, & Kishida, K (2014). Overview of the NTCIR-11 SpokenQuery&Doc Task. In Proc. of NTCIR-11. National Institute of Informatics, Tokyo, (pp. 1–7).NIST. Draft KWS16 Keyword Search Evaluation Plan. https://www.nist.gov/sites/default/files/documents/itl/iad/mig/KWS16-evalplan-v04.pdf . Accessed Feb 2018.Anguera, X, Metze, F, Buzo, A, Szöke, I, Rodriguez-Fuentes, LJ (2013). The spoken web search task. In Proc. of MediaEval. Ruzica Piskac, New Haven, (pp. 921–922).Taras, B, & Nadeu, C (2011). Audio segmentation of broadcast news in the Albayzin-2010 evaluation: overview, results, and discussion. EURASIP Journal on Audio, Speech, and Music Processing, 2011(1), 1–10.ZelenĂĄk, M, Schulz, H, Hernando, J (2012). Speaker diarization of broadcast news in Albayzin 2010 evaluation campaign. EURASIP Journal on Audio, Speech, and Music Processing, 2012(19), 1–9.RodrĂ­guez-Fuentes, LJ, Penagarikano, M, Varona, A, DĂ­ez, M, Bordel, G (2011). The Albayzin 2010 Language Recognition Evaluation. In Proc. of Interspeech. ISCA, Baixas, (pp. 1529–1532).Tejedor, J, Toledano, DT, Lopez-Otero, P, Docio-Fernandez, L, Garcia-Mateo, C, Cardenal, A, Echeverry-Correa, JD, Coucheiro-Limeres, A, Olcoz, J, Miguel, A (2015). Spoken term detection ALBAYZIN 2014 evaluation: overview, systems, results, and discussion. EURASIP, Journal on Audio, Speech and Music Processing, 2015(21), 1–27.Tejedor, J, Toledano, DT, Anguera, X, Varona, A, Hurtado, LF, Miguel, A, ColĂĄs, J (2013). Query-by-example spoken term detection ALBAYZIN 2012 evaluation: overview, systems, results, and discussion. EURASIP, Journal on Audio, Speech, and Music Processing, 2013(23), 1–17.Tejedor, J, Toledano, DT, Lopez-Otero, P, Docio-Fernandez, L, Garcia-Mateo, C (2016). Comparison of ALBAYZIN query-by-example spoken term detection 2012 and 2014 evaluations. EURASIP, Journal on Audio, Speech and Music Processing, 2016(1), 1–19.MĂ©ndez, F, DocĂ­o, L, Arza, M, Campillo, F (2010). The Albayzin 2010 text-to-speech evaluation. In Proc. of FALA. UniversidadeVigo, Vigo, (pp. 317–340).Billa, J, Ma, KW, McDonough, JW, Zavaliagkos, G, Miller, DR, Ross, KN, El-Jaroudi, A (1997). Multilingual speech recognition: the 1996 Byblos Callhome system. In Proc. of Eurospeech. ISCA, Baixas, (pp. 363–366).Killer, M, Stuker, S, Schultz, T (2003). Grapheme based speech recognition. In Proc. of Eurospeech. ISCA, Baixas, (pp. 3141–3144).Burget, L, Schwarz, P, Agarwal, M, Akyazi, P, Feng, K, Ghoshal, A, Glembek, O, Goel, N, Karafiat, M, Povey, D, Rastrow, A, Rose, RC, Thomas, S (2010). Multilingual acoustic modeling for speech recognition based on subspace gaussian mixture models. In Proc. of ICASSP. IEEE, New York, (pp. 4334–4337).Cuayahuitl, H, & Serridge, B (2002). Out-of-vocabulary word modeling and rejection for Spanish keyword spotting systems. In Proc. of MICAI. Springer, Berlin, (pp. 156–165).Tejedor, J (2009). Contributions to keyword spotting and spoken term detection for information retrieval in audio mining. PhD thesis, Universidad AutĂłnoma de Madrid, Madrid, Spain.Tejedor, J, Toledano, DT, Wang, D, King, S, ColĂĄs, J (2014). Feature analysis for discriminative confidence estimation in spoken term detection. Computer Speech and Language, 28(5), 1083–1114.Li, J, Wang, X, Xu, B (2014). An empirical study of multilingual and low-resource spoken term detection using deep neural networks. In Proc. of Interspeech. ISCA, Baixas, (pp. 1747–1751).NIST. The Spoken Term Detection (STD) 2006 evaluation plan. http://berlin.csie.ntnu.edu.tw/Courses/Special%20Topics%20in%20Spoken%20Language%20Processing/Lectures2008/SLP2008S-Lecture12-Spoken%20Term%20Detection.pdf . Accessed Feb 2018.Fiscus, JG, Ajot, J, Garofolo, JS, Doddingtion, G (2007). Results of the 2006 spoken term detection evaluation. In Proc. of SSCS. ACM, New York, (pp. 45–50).Martin, A, Doddington, G, Kamm, T, Ordowski, M, Przybocki, M (1997). The DET curve in assessment of detection task performance. In Proc. of Eurospeech. ISCA, Baixas, (pp. 1895–1898).NIST. Evaluation Toolkit (STDEval) software. https://www.nist.gov/itl/iad/mig/tools . Accessed Feb 2018.Union, IT. ITU-T Recommendation P.563: Single-ended method for objective speech quality assessment in narrow-band telephony applications. http://www.itu.int/rec/T-REC-P.563/en . Accessed Feb 2018.Rajput, N, & Metze, F (2011). Spoken web search. In Proc. of MediaEval. Ruzica Piskac, New Haven, (pp. 1–2).Metze, F, Barnard, E, Davel, M, van Heerden, C, Anguera, X, Gravier, G, Rajput, N (2012). The spoken web search task. In Proc. of MediaEval. Ruzica Piskac, New Haven, (pp. 41–42).Szöke, I, Rodriguez-Fuentes, LJ, Buzo, A, Anguera, X, Metze, F, Proenca, J, Lojka, M, Xiong, X (2015). Query by Example Search on Speech at Mediaeval 2015. In Proc. of MediaEval. Ruzica Piskac, New Haven, (pp. 81–82).Szöke, I, & Anguera, X (2016). Zero-cost speech recognition task at Mediaeval 2016. In Proc. of MediaEval. Ruzica Piskac, New Haven, (pp. 81–82).Akiba, T, Nishizaki, H, Nanjo, H, Jones, GJF (2014). Overview of the NTCIR-11 spokenquery &doc task. In Proc. of NTCIR-11. National Institute of Informatics, Tokyo, (pp. 1–15).Akiba, T, Nishizaki, H, Nanjo, H, Jones, GJF (2016). Overview of the NTCIR-12 spokenquery &doc-2. In Proc. of NTCIR-12. National Institute of Informatics, Tokyo, (pp. 1–13).Schwarz, P (2008). Phoneme recognition based on long temporal context. PhD thesis, FIT, BUT, Brno, Czech Republic.Varona, A, Penagarikano, M, RodrĂ­guez-Fuentes, LJ, Bordel, G (2011). On the use of lattices of time-synchronous cross-decoder phone co-occurrences in a SVM-phonotactic language recognition system. In Proc. of Interspeech. ISCA, Baixas, (pp. 2901–2904).Eyben, F, Wollmer, M, Schuller, B (2010). OpenSMILE—the munich versatile and fast open-source audio feature extractor. In Proc. of ACM Multimedia (MM). ACM, New York, (pp. 1459–1462).Lopez-Otero, P, Docio-Fernandez, L, Garcia-Mateo, C (2016). Finding relevant features for zero-resource query-by-example search on speech. Speech Communication, 84(1), 24–35.Zhang, Y, & Glass, JR (2009). Unsupervised spoken keyword spotting via segmental DTW on Gaussian posteriorgrams. In Proc. of ASRU. IEEE, New York, (pp. 398–403).Povey, D, Ghoshal, A, Boulianne, G, Burget, L, Glembek, O, Goel, N, Hannemann, M, Motlicek, P, Qian, Y, Schwarz, P, Silovsky, J, Stemmer, G, Vesely, K (2011). The KALDI speech recognition toolkit. In Proc. of ASRU. IEEE, New York, (pp. 1–4).Muller, M. (2007). Information retrieval for music and motion. New York: Springer.Szöke, I, Skacel, M, Burget, L (2014). BUT QUESST 2014 system description. In Proc. of MediaEval. Ruzica Piskac, New Haven, (pp. 621–622).BrĂŒmmer, N, & van Leeuwen, D (2006). On calibration of language recognition scores. In Proc of the IEEE Odyssey: The speaker and language recognition workshop. IEEE, New York, (pp. 1–8).BrĂŒmmer, N, & de Villiers, E. The BOSARIS toolkit user guide: Theory, algorithms and code for binary classifier score processing. Technical report. https://sites.google.com/site/nikobrummer . Accessed Feb 2018.Meinedo, H, & Neto, J (2005). A stream-based audio segmentation, classification and clustering pre-processing system for broadcast news using ANN models. In Proc. of Interspeech. ISCA, Baixas, (pp. 237–240).Morgan, N, & Bourlard, H (1995). An introduction to hybrid HMM/connectionist continuous speech recognition. IEEE Signal Processing Magazine, 12(3), 25–42.Meinedo, H, Abad, A, Pellegrini, T, Trancoso, I, Neto, J (2010). The L2F broadcast news speech recognition system. In Proc. of FALA. UniversidadeVigo, Vigo, (pp. 93–96).Abad, A, Luque, J, Trancoso, I (2011). Parallel transformation network features for speaker recognition. In Proc. of ICASSP. IEEE, New York, (pp. 5300–5303).Diez, M, Varona, A, Penagarikano, M, Rodriguez-Fuentes, LJ, Bordel, G (2012). On the use of phone log-likelihood ratios as features in spoken language recognition. In Proc. of SLT. IEEE, New York, (pp. 274–279).Diez, M, Varona, A, Penagarikano, M, Rodriguez-Fuentes, LJ, Bordel, G (2014). New insight into the use of phone log-likelihood ratios as features for language recognition. In Proc. of Interspeech. ISCA, Baixas, (pp. 1841–1845).Abad, A, Ribeiro, E, Kepler, F, Astudillo, R, Trancoso, I (2016). Exploiting phone log-likelihood ratio features for the detection of the native language of non-native English speakers. In Proc. of Interspeech. ISCA, Baixas, (pp. 2413–2417).RodrĂ­guez-Fuentes, LJ, Varona, A, Peñagarikano, M, Bordel, G, DĂ­ez, M (2014). High-performance query-by-example spoken term detection on the SWS 2013 evaluation. In Proc. of ICASSP. IEEE, New York, (pp. 7819–7823).Vesely, K, Ghoshal, A, Burget, L, Povey, D (2013). Sequence-discriminative training of deep neural networks. In Proc. of Interspeech. ISCA, Baixas, (pp. 2345–2349).Ghahremani, P, BabaAli, B, Povey, D, Riedhammer, K, Trmal, J, Khudanpur, S (2014). A pitch extraction algorithm tuned for automatic speech recognition. In Proc. of ICASSP. IEEE, New York, (pp. 2494–2498).Povey, D, Hannemann, M, Boulianne, G, Burget, L, Ghoshal, A, Janda, M, Karafiat, M, Kombrink, S, Motlicek, P, Qian, Y, Riedhammer, K, Vesely, K, Vu, NT (2012). Generating exact lattices in the WFST framework. In Proc. of ICASSP. IEEE, New York, (pp. 4213–4216).Garcia-Mateo, C, Dieguez-Tirado, J, Docio-Fernandez, L, Cardenal-Lopez, A (2004). Transcrigal: A bilingual system for automatic indexing of broadcast news. In Proc. of LREC. ELRA, Paris, (pp. 2061–2064).Stolcke, A (2002). SRILM—an extensible language modeling toolkit. In Proc. of Interspeech. ISCA, Baixas, (pp. 901–904).Lopez-Otero, P, Docio-Fernandez, L, Garcia-Mateo, C (2016). GTM-UVigo systems for Albayzin 2016 search on speech evaluation. In Proc. of Iberspeech. Springer, Berlin, (pp. 65–74).Chen, G, Khudanpur, S, Povey, D, Trmal, J, Yarowsky, D, Yilmaz, O (2013). Quantifying the value of pronunciation lexicons for keyword search in low resource languages. In Proc. of ICASSP. IEEE, New York, (pp. 8560–8564).Pham, VT, Chen, NF, Sivadas, S, Xu, H, Chen, I-F, Ni, C, Chng, ES, Li, H (2014). System and keyword dependent fusion for spoken term detection. In Proc. of SLT. IEEE, New York, (pp. 430–435).Can, D, & Saraclar, M (2011). Lattice indexing for spoken term detection. IEEE Transactions on Audio, Speech and Language Processing, 19(8), 2338–2347.Miller, DRH, K

    Using genomics to understand the origin and dispersion of multidrug and extensively drug resistant tuberculosis in Portugal.

    Get PDF
    Portugal is a low incidence country for tuberculosis (TB) disease. Now figuring among TB low incidence countries, it has since the 1990s reported multidrug resistant and extensively drug resistant (XDR) TB cases, driven predominantly by two strain-types: Lisboa3 and Q1. This study describes the largest characterization of the evolutionary trajectory of M/XDR-TB strains in Portugal, spanning a time-period of two decades. By combining whole-genome sequencing and phenotypic susceptibility data for 207 isolates, we report the geospatial patterns of drug resistant TB, particularly the dispersion of Lisboa3 and Q1 clades, which underly 64.2% and 94.0% of all MDR-TB and XDR-TB isolates, respectively. Genomic-based similarity and a phylogenetic analysis revealed multiple clusters (n = 16) reflecting ongoing and uncontrolled recent transmission of M/XDR-TB, predominantly associated with the Lisboa3 and Q1 clades. These clades are now thought to be evolving in a polycentric mode across multiple geographical districts. The inferred evolutionary history is compatible with MDR- and XDR-TB originating in Portugal in the 70's and 80's, respectively, but with subsequent multiple emergence events of MDR and XDR-TB particularly involving the Lisboa3 clade. A SNP barcode was defined for Lisboa3 and Q1 and comparison with a phylogeny of global strain-types (n = 28 385) revealed the presence of Lisboa3 and Q1 strains in Europe, South America and Africa. In summary, Portugal displays an unusual and unique epidemiological setting shaped by >40 years of uncontrolled circulation of two main phylogenetic clades, leading to a sympatric evolutionary trajectory towards XDR-TB with the potential for global reach

    A method for the extraction of phonetically-rich triphone sentences

    Get PDF
    A method is proposed for compiling a corpus of phonetically-rich triphone sentences; i.e., sentences with a high variety of triphones, distributed in a uniform fashion. Such a corpus is of interest for a wide range of contexts, from automatic speech recognition to speech therapy. We evaluated this method by building phonetically-rich corpora for Brazilian Portuguese. The data employed comes from Wikipedia’s dumps, which were converted into plain text, segmented and phonetically transcribed. The method consists of comparing the distance between the triphone distribution of the available sentences to na ideal uniform distribution, with equiprobable triphones. A greedy algorithm was implemented to recognize and evaluate the distance among sentences. A heuristic metric is proposed for pre-selecting sentences for the algorithm, in order to quicken its execution. The results show that, by applying the proposed metric, one can build corpora with more uniform triphone distributions

    Unraveling Mycobacterium tuberculosis genomic diversity and evolution in Lisbon, Portugal, a highly drug resistant setting.

    Get PDF
    BACKGROUND: Multidrug- (MDR) and extensively drug resistant (XDR) tuberculosis (TB) presents a challenge to disease control and elimination goals. In Lisbon, Portugal, specific and successful XDR-TB strains have been found in circulation for almost two decades. RESULTS: In the present study we have genotyped and sequenced the genomes of 56 Mycobacterium tuberculosis isolates recovered mostly from Lisbon. The genotyping data revealed three major clusters associated with MDR-TB, two of which are associated with XDR-TB. Whilst the genomic data contributed to elucidate the phylogenetic positioning of circulating MDR-TB strains, showing a high predominance of a single SNP cluster group 5. Furthermore, a genome-wide phylogeny analysis from these strains, together with 19 publicly available genomes of Mycobacterium tuberculosis clinical isolates, revealed two major clades responsible for M/XDR-TB in the region: Lisboa3 and Q1 (LAM).The data presented by this study yielded insights on microevolution and identification of novel compensatory mutations associated with rifampicin resistance in rpoB and rpoC. The screening for other structural variations revealed putative clade-defining variants. One deletion in PPE41, found among Lisboa3 isolates, is proposed to contribute to immune evasion and as a selective advantage. Insertion sequence (IS) mapping has also demonstrated the role of IS6110 as a major driver in mycobacterial evolution by affecting gene integrity and regulation. CONCLUSIONS: Globally, this study contributes with novel genome-wide phylogenetic data and has led to the identification of new genomic variants that support the notion of a growing genomic diversity facing both setting and host adaptation

    Genomic epidemiology of Mycobacterium tuberculosis in Santa Catarina, Southern Brazil

    Get PDF
    Mycobacterium tuberculosis (M.tb), the pathogen responsible for tuberculosis (TB) poses as the major cause of death among infectious diseases. The knowledge about the molecular diversity of M.tb enables the implementation of more effective surveillance and control measures and, nowadays, Whole Genome Sequencing (WGS) holds the potential to produce high-resolution epidemiological data in a high-throughput manner. FlorianĂłpolis, the state capital of Santa Catarina (SC) in south Brazil, shows a high TB incidence (46.0/100,000). Here we carried out a WGS-based evaluation of the M.tb strain diversity, drug-resistance and ongoing transmission in the capital metropolitan region. Resistance to isoniazid, rifampicin, streptomycin was identified respectively in 4.0% (n = 6), 2.0% (n = 3) and 1.3% (n = 2) of the 151 studied strains by WGS. Besides, resistance to pyrazinamide and ethambutol was detected in 0.7% (n = 1) and reistance to ethionamide and fluoroquinolone (FQ) in 1.3% (n = 2), while a single (0.7%) multidrug-resistant (MDR) strain was identified. SNP-based typing classified all isolates into M.tb Lineage 4, with high proportion of sublineages LAM (60.3%), T (16.4%) and Haarlem (7.9%). The average core-genome distance between isolates was 420.3 SNPs, with 43.7% of all isolates grouped across 22 genomic clusters thereby showing the presence of important ongoing TB transmission events. Most clusters were geographically distributed across the study setting which highlights the need for an urgent interruption of these large transmission chains. The data conveyed by this study shows the presence of important and uncontrolled TB transmission in the metropolitan area and provides precise data to support TB control measures in this region.publishersversionpublishe
    corecore